Some Keyword-Based Characteristics for Evaluation of Thematic Structure of Multidisciplinary Documents

Authors

  • Mikhail Alexandrov
  • Alexander Gelbukh
  • Pavel Makagonov
Abstract

This paper considers the classification of complex interdisciplinary documents with a high level of informational noise. The set of classification domains is assumed to be fixed, and each domain is defined by an appropriate keyword list. Quantitative and qualitative characteristics, as well as visual presentations used for such classification, are discussed. A program, Text Recognizer, based on these characteristics is presented.
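As a rough illustration of what a keyword-based domain characteristic might look like, the sketch below scores a document against fixed domain keyword lists by the share of its words that belong to each list. The domain names, the keyword lists, and the function `domain_profile` are all hypothetical and chosen only for illustration; the abstract does not specify the characteristics actually used by Text Recognizer.

```python
import re
from collections import Counter

# Hypothetical domain keyword lists (assumption: the paper's real lists are not given here).
DOMAINS = {
    "medicine": {"patient", "clinical", "therapy"},
    "computing": {"algorithm", "software", "network"},
}

def domain_profile(text, domains=DOMAINS):
    """Return, per domain, the fraction of the document's words that are domain keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero for empty text
    return {
        name: sum(counts[k] for k in keywords) / total
        for name, keywords in domains.items()
    }

profile = domain_profile(
    "The algorithm schedules clinical software for each patient network."
)
```

A document with comparable scores in several domains would be a candidate for the "multidisciplinary" label, while uniformly low scores might indicate noise; both readings are assumptions about how such a profile could be interpreted, not claims about the authors' method.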

Similar resources

Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...

Coronavirus: Discover the Structure of Global Knowledge, Hidden Patterns & Emerging Events

Background & Objective:  The present study aimed at exploring the structure of global knowledge, hidden patterns, and emerging Coronavirus events using co-word techniques. Co-word analysis is one of the most efficient scientific methods to analyze the structure and dynamics of knowledge and the general state of research.  Materials & Methods:  This applied research performed using Co-word anal...

Drawing Co-Citation Networks of Corona Virus Studies

Background and Aim: The purpose of the present study is to map the coronavirus domain citation network to better understand this domain based on all other citation networks.  Materials and Methods: The present study is applied in terms of purpose, and is descriptive scientometrics in terms of type, which has been done with the all-citation method. In this study, all scientific publications on ...

A New Text-Mining Method for Extracting User Context Information to Improve the Ranking of Search Engine Results

Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the ones most similar to the user ...

A Bibliometric Analysis of Hematological Research Productivity among Five Islamic Countries during 1996 to 2013 (a 17-years period)

Background: This study made an attempt to make the quantitative and qualitative evaluation of hematological research output in five Islamic countries Iran, Turkey, Malaysia, Saudi Arabia and Egypt which have the most scientific productions from 1996-2013. Materials and Methods: The current study was carried out during the 1st to 31st of September, 2014 in Blood Transfusion Research Center, Sh...

Publication date: 2010